Abstract: We consider the problem of communicating the state
of a dynamical system via a Shannon Gaussian channel. The 
receiver, which acts as both a decoder and estimator, observes 
the noisy measurement of the channel output and makes an 
optimal estimate of the state of the dynamical system in the 
minimum mean square sense. Noisy feedback from the receiver 
to the transmitter is present. The transmitter observes the noise- 
corrupted feedback message from the receiver together with a 
possibly noisy measurement of the state of the dynamical system.
These measurements are then used to encode the message to be 
transmitted over a noisy Gaussian channel, where a per symbol 
power constraint is imposed on the transmitted message. Thus, 
we get a mixed problem of Shannon’s source-channel coding 
problem and a sort of Kalman filtering problem. In particular, we 
consider two feedback instances, one being feedback of receiver 
measurements and the second being the receiver’s state estimates. 
We show that optimal encoders and decoders are linear filters 
with a finite memory and we give explicitly the state space realizations
of the optimal filters. For the case where the transmitter has
access to noisy measurements of the state, we derive a separation 
principle for the optimal communication scheme. Furthermore, 
we investigate the presence of noiseless feedback or no feedback 
from the receiver to the transmitter. Necessary and sufficient 
conditions for the existence of a stationary solution are also given 
for the feedback cases considered. 


Notation

$x^t$ : $x^t = (x(0), x(1), \ldots, x(t))$.
$\mathbb{L}$ : The set of lower triangular matrices.
$B$ : The backward shift operator, $x(t-1) = Bx(t)$.
$\mathbf{E}\{\cdot\}$ : $\mathbf{E}\{x\}$ denotes the expected value of the stochastic variable $x$.
$\mathbf{E}\{\cdot \mid \cdot\}$ : $\mathbf{E}\{x \mid y\}$ denotes the expected value of the stochastic variable $x$ given $y$.
$\mathrm{cov}$ : $\mathrm{cov}\{x, y\} = \mathbf{E}\{x y^T\}$.
$h(x)$ : The entropy of $x$.
$h(x \mid y)$ : The entropy of $x$ given $y$.
$I(x; y)$ : The mutual information between $x$ and $y$.
$\mathcal{N}(m, V)$ : The set of Gaussian variables with mean $m$ and covariance $V$.


I. Introduction 


A. Background 

Many problems in practice require state estimation of a dynamical
system where the possibly noisy state measurements at one end are
transmitted over a noisy communication channel to another end where
the state estimation is to be performed.

Fig. 1. A simple model of an estimation problem of the state of the dynamical
system $H$ over a Gaussian communications channel with Gaussian noise
$n \sim \mathcal{N}(0, N)$, Gaussian noise $n_f \sim \mathcal{N}(0, N_f)$ for the feedback channel,
the coloring filter $S$ of the measurement noise $n$, and delay given by the
backward shift operator $B$. The optimization parameters are given by the
encoder $G$ and the decoder $F$. The symbols of the encoder output $z$ are
power limited with $\mathbf{E}|z(t)|^2 \le P$.

Shannon [?], [?] considered the problem of reliable communication
of a one-dimensional source over a one-dimensional Gaussian
channel. In particular, Shannon considered the following
coding-decoding setting for an analog Gaussian channel:

$$\inf_{f,g}\ \mathbf{E}|x - f(g(x) + n)|^2 \quad \text{subject to} \quad \mathbf{E}|g(x)|^2 \le P,$$

where $x \sim \mathcal{N}(0, X)$, $n \sim \mathcal{N}(0, N)$, and $f, g$ are arbitrary
functions with $\mathbf{E}|g(x)|^2 \le P$. Shannon showed that the
infimum can be attained by using linear encoder and decoder $g$
and $f$, respectively. The generalization of Shannon's result to
higher dimensions is still open and there are examples where
linear coding and decoding strategies might not be optimal [?].
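As a hedged illustration of the scalar setting above, the following Python sketch evaluates the mean squared error of one particular linear pair: a scaling encoder $g(x) = \sqrt{P/X}\,x$ followed by the linear minimum mean square decoder. It also evaluates the distortion $X\,2^{-2C}$ implied by the Gaussian rate-distortion function at the channel capacity $C$. The function names are illustrative assumptions and do not come from the paper.

```python
import math


def linear_scheme_mse(X: float, N: float, P: float) -> float:
    """MSE of the encoder g(x) = sqrt(P/X) * x with a linear MMSE decoder.

    X is the source variance, N the channel noise variance, P the power limit.
    """
    gain = math.sqrt(P / X)          # encoder gain, so that E|g(x)|^2 = P
    cov_xy = gain * X                # cov{x, y} for y = g(x) + n
    var_y = P + N                    # variance of the channel output y
    return X - cov_xy ** 2 / var_y   # error variance of the linear estimate


def distortion_at_capacity(X: float, N: float, P: float) -> float:
    """Distortion X * 2^(-2C) with C = 0.5 * log2(1 + P/N)."""
    capacity = 0.5 * math.log2(1.0 + P / N)
    return X * 2.0 ** (-2.0 * capacity)
```

Both functions reduce to $XN/(P+N)$, which is one way to see why a linear pair can attain the infimum in the scalar Gaussian case.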

An important generalization of Shannon's AWGN channel
is the case when the message $x$ to be estimated is the state
of a given linear dynamical system driven by process noise.
For instance, this problem arises in video-streaming over a
wireless channel. A video stream consists of highly correlated
information described by a dynamical system due to the
correlation between the sequential picture frames. This is an
instance of the general MIMO communication problem with
causality constraints, which adds structure to the problem.
Another generalization is when the measurement noise is
colored with the coloring filter given by a linear filter $S$, see
Figure 1 for an illustration of the generalized communication
system.

Fig. 2. A simple model of an estimation problem of the state of the dynamical
system $H$ over a Gaussian communications channel with Gaussian noise
$n \sim \mathcal{N}(0, N)$ and delay given by the backward shift operator $B$. The optimization
parameters are given by the encoder $G$ and the decoder $F$. The samples of
the encoder output $z$ are power limited with $\mathbf{E}|z(t)|^2 \le P$.

More specifically, consider the block-diagram in Fig. 2. We
have the process noise given by $w$, which is assumed to be
Gaussian white noise, and the state is given by $x = Hw$ where
$H$ is a causal linear operator/filter.

The precoder is given by the causal operator $G$, not necessarily
linear. The encoded signal $z = Gx$ is then transmitted over
a Gaussian channel with white noise given by $n$. Typically,
one has power constraints on the transmitted signal $z(t)$, that
is $\mathbf{E}|z(t)|^2 \le P$, for some positive real number $P$. At the
other end, the message received is $y(t) = z(t) + n(t)$, for
$t = 0, \ldots, T-1$, and is delayed with $d$ time steps by the
backward shift operator $B$. Finally, the causal operator $F$ is
the decoder, designed to reconstruct the state $x$ by $\hat{x} = FBy$,
to minimize the mean squared error $\mathbf{E}|e|^2 = \mathbf{E}|x - \hat{x}|^2$.

For the case where G is a fixed linear operator, the optimal 
filter F is well known to be given by the optimal Kalman filter, 
which is a linear operator. However, if G is a precoder to be 
co-designed together with F, we get a nonconvex problem 
even if we restrict the optimization problem to be carried out 
over linear operators/filters. To this date, it’s not known if 
linear filters are optimal, and whether the order of the linear 
optimal filters is finite for the general MIMO case. 
Fig. 3. A simple model of a filtering problem over a Gaussian communications
channel with noiseless feedback.


B. Previous work 

Kalman [?] made a fundamental contribution to optimal
control and filtering of linear dynamical systems by deriving
recursive state space solutions. The model considered
by Kalman assumes given linear measurements of the state,
possibly partial and corrupted by noise. The solution relies on
an orthogonality principle, where the filter update is based
on an innovations process representing information that is
orthogonal to the state estimate of the filter.

The problem of optimal state estimation used for control
of scalar dynamical systems was considered in [?], where
noiseless feedback of the measurements at the receiver is
present at the transmitter (see Figure 3), and it was shown
that linear filters were optimal. The role of a communication
channel with feedback and its effect on stability was studied in
[?], where necessary conditions for stability were given for linear
time-invariant channels; the corresponding conditions for
time-varying channels were given in [?]. Fundamental limitations
of performance with sensitivity functions as a measure were studied in [?].
The problem of communication and filtering over a noisy
channel for the stationary case has been considered in [?],
where it was shown that this problem can be transformed to a
convex optimization problem that grows with the size of the
time horizon. However, the order of the linear optimal filters
obtained from [9] is infinite.

In another direction, [?] studied the problem of source-channel
coding over a communication channel with colored
noise with the correlation given by a linear filter $S$, as depicted
in Figure [?]. Here, the filter $H$ is the identity (so $x = w$),
V = 0, and $G$ encodes the information given by $w$ by using
information of the measurements (with delay $d = 1$) at the
receiver through noiseless feedback. Although the problem
in [?] considered maximizing the channel capacity, it was
equivalent to the problem of minimizing the mean squared
error of the state estimate as shown in Figure [?]. Also here,
the solution relied on a sort of orthogonality principle where
the transmitted information is orthogonal to that available at
the receiver.
Fig. 4. A simple model of a filtering problem over a Gaussian communications
channel with noiseless feedback.


In [?], preliminary results (with incomplete proofs) were
given for the special case of communication and estimation
without feedback for the scalar case as depicted in Figure
[?]. In all previous work, except [?], [?], [?], average power
constraints were assumed. Per symbol power constraints were
considered in [?], [?], [?].


C. Contributions

We consider the linear dynamical system H given by 

$$x(t+1) = a x(t) + b w(t), \quad x(0) = x_0, \quad 0 \le t \le T-1.$$

The main contribution of this paper is to derive the structure
and explicit expressions of the optimal communication
schemes as described in Figures 6 and 7, respectively,
where noisy feedback is present from the receiver side to
the transmitter. We show that the optimal filters $F$ and $G$
are linear and have a finite memory independent of the size
of the time horizon. In particular, we consider per symbol
power constraints on the transmitter signal as opposed to the
average power constraints considered in the literature. We
show explicitly that the state space realizations of the optimal
filters (for the case of full state measurement at the transmitter
with delay at the receiver given by $d = 1$) are given by

$$G:\ \begin{cases} s(t+1) = a s(t) + K(t)\,(z(t) + \hat{n}(t)) \\ \bar{x}(t) = x(t) - s(t) \\ z(t) = \dfrac{\sqrt{P}}{\sigma_t}\,\bar{x}(t), \end{cases}$$

$$F:\ \hat{x}(t+1) = a\hat{x}(t) + K(t)\,y(t)$$

with $\mathbf{E}n^2(t) = N$, $\mathbf{E}n_f^2(t) = N_f$, $\sigma_t^2 = \mathbf{E}\bar{x}^2(t)$, $K(t) = a\sigma_t\sqrt{P}(P+N)^{-1}$,
and $s(0) = 0$. Here $\hat{n}(t) = \mathbf{E}\{n(t) \mid n(t) + n_f(t)\}$ is the
transmitter's estimate of the channel noise.

The interpretation of the state space equations is the following.
$s(t) = \mathbf{E}\{\hat{x}(t) \mid x^t, y_f^{t-1}\}$ is the estimate at the transmitter
of the estimate $\hat{x}(t)$ at the decoder. The transmitter's estimate
of $e(t)$ is $\bar{x}(t) = \mathbf{E}\{e(t) \mid x^t, y_f^{t-1}\} = x(t) - s(t)$. This
estimate is then transmitted over the Gaussian channel, in order
to supply the decoder with the innovations (the incremental
information the decoder needs to correct its estimate of $x(t)$).

We show that the error $e(t)$ may be stationary if and
only if $|a| < 1$. Then, we consider the filtering problem
over a communication channel, where noiseless feedback is
introduced from the channel output to the precoder as depicted
in Figure 3. We show that the optimal transmitter and receiver
are given by

$$\begin{aligned} \hat{x}(t+1) &= a\hat{x}(t) + K(t)\,y(t) \\ \tilde{x}(t) &= x(t) - \hat{x}(t) \\ z(t) &= \frac{\sqrt{P}}{\sigma_t}\,\tilde{x}(t), \end{aligned} \tag{1}$$

with

$$K(t) = a\,\frac{\sigma_t\sqrt{P}}{P+N},$$

and $\sigma_t^2 = \mathbf{E}|\tilde{x}(t)|^2$ given by $\sigma_0^2 = \mathbf{E}x_0^2 = V_{xx}(0)$ and

$$\sigma_t^2 = a^2\sigma_{t-1}^2\,\frac{N}{N+P} + b^2.$$

Furthermore, we show that the error variance $\sigma_t^2$ is bounded
as $t \to \infty$ if and only if

$$\log_2(|a|) < C$$

where $C$ is the capacity of the Gaussian channel from the
transmitter to the receiver, which is similar to previously
published results in the context of stabilization of control
systems over communication channels [?]. We also consider
the problem of communication under noisy feedback of the
decoder's state estimates at the transmitter (see Figure 7). We
find explicitly the optimum filter pair which is given by


x{t + 1) = ax{t) + K{t)y{t) 

x{t -h 1) = aN{P -f N)~^x{t) -h x{t + 1) — ax{t) 

+ - Ht) - Viit)) 

z{t) = —x{t), 
o-t 

where = Ex^(t) and = Ex^(f) — Ex^(t) — We 
show that the estimation error is bounded as $t \to \infty$ if and
only if there exists a solution to the system of nonlinear
equations


(7^ = 


2^2 




(P + A^)2 




and 


_2 a^PN ^ 
~ + ^ (P +Ar)2^ 


The above equations are equivalent to a system of fourth order 
polynomial equations in two variables which can be solved 
efficiently using standard numerical tools. 
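
For the noiseless-feedback case stated earlier in this section, the error variance follows the scalar recursion $\sigma_t^2 = a^2\sigma_{t-1}^2 N/(N+P) + b^2$, and boundedness is tied to the condition $\log_2(|a|) < C$ with $C = \frac{1}{2}\log_2(1 + P/N)$. The Python sketch below simply iterates that recursion and checks the condition. The function names and the parameter values in the usage note are illustrative assumptions.

```python
import math


def channel_capacity(P: float, N: float) -> float:
    """Capacity (bits per channel use) of the Gaussian channel, 0.5*log2(1 + P/N)."""
    return 0.5 * math.log2(1.0 + P / N)


def error_variance(a: float, b: float, P: float, N: float,
                   sigma0_sq: float, steps: int) -> list:
    """Iterate sigma_t^2 = a^2 * N/(N+P) * sigma_{t-1}^2 + b^2 for `steps` steps."""
    contraction = a * a * N / (N + P)
    values = [sigma0_sq]
    for _ in range(steps):
        values.append(contraction * values[-1] + b * b)
    return values


def variance_is_bounded(a: float, P: float, N: float) -> bool:
    """Check log2(|a|) < C; a = 0 is treated as trivially bounded."""
    if a == 0.0:
        return True
    return math.log2(abs(a)) < channel_capacity(P, N)
```

For example, with $a = 1.5$, $b = 1$, $P = 3$, $N = 1$ the capacity is one bit, the condition holds, and the iterates approach the fixed point $b^2 / (1 - a^2 N/(N+P))$.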
II. Preliminaries


Definition 1: The entropy of a real-valued stochastic variable
$X$ with probability distribution $p(x)$ is defined as

$$h(X) = -\int_{-\infty}^{\infty} p(x)\log_2 p(x)\,dx.$$

Definition 2: For two real-valued stochastic variables $X$
and $Y$, the conditional entropy of $X$ given $Y$ is defined as

$$h(X \mid Y) = h(X, Y) - h(Y).$$

Definition 3: The mutual information between $X$ and $Y$ is
defined as

$$I(X; Y) = h(X) - h(X \mid Y) = h(Y) - h(Y \mid X).$$

Proposition 1 (Entropy Power Inequality): If $X$ and $Y$ are
independent scalar random variables, then

$$2^{2h(X+Y)} \ge 2^{2h(X)} + 2^{2h(Y)}$$

with equality if $X$ and $Y$ are Gaussian stochastic variables.

Proof: See [?], p. 674 - 675. ■
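
The equality case of the entropy power inequality can be checked numerically for scalar Gaussians, using the standard closed form $h = \frac{1}{2}\log_2(2\pi e\,\sigma^2)$ and the fact that variances of independent variables add. The helper names below are illustrative assumptions, not notation from the paper.

```python
import math


def gaussian_entropy_bits(variance: float) -> float:
    """Differential entropy (in bits) of a scalar Gaussian with the given variance."""
    return 0.5 * math.log2(2.0 * math.pi * math.e * variance)


def entropy_power(h_bits: float) -> float:
    """The quantity 2^(2h) that appears on both sides of the inequality."""
    return 2.0 ** (2.0 * h_bits)


def epi_gap_gaussian(var_x: float, var_y: float) -> float:
    """Left side minus right side of the inequality for independent Gaussians."""
    lhs = entropy_power(gaussian_entropy_bits(var_x + var_y))
    rhs = (entropy_power(gaussian_entropy_bits(var_x))
           + entropy_power(gaussian_entropy_bits(var_y)))
    return lhs - rhs
```

For Gaussian inputs the gap is zero up to rounding, since $2^{2h}$ equals $2\pi e\,\sigma^2$ and is therefore additive in the variance.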

Definition 4: Random variables $X$, $Y$, $Z$ are said to form a
Markov chain in that order if the conditional distribution of $Z$
depends only on $Y$ and is conditionally independent of $X$. This
is denoted by $X \to Y \to Z$.

Proposition 2 (Data-Processing Inequality): If

$$X \to Y \to Z,$$

then

$$I(X; Z) \le I(Y; Z).$$

Proof: See [?], p. 34-35. ■

Fig. 5. A simple model of an estimation problem of the state of the dynamical
system $H$ over a Gaussian communications channel with Gaussian noise
$n \sim \mathcal{N}(0, N)$, Gaussian noise $n_f \sim \mathcal{N}(0, N_f)$ for the feedback channel, and delay
given by the backward shift operator $B$. The optimization parameters are given
by the encoder $G$ and the decoder $F$. The samples of the encoder output $z$
are power limited with a peak power constraint given by $\mathbf{E}|z(t)|^2 \le P$.

Proposition 3: Let $X$ and $Y$ be two stochastic variables.
The optimal solution to the optimization problem

$$\inf_{f(\cdot)}\ \mathbf{E}|X - f(Y)|^2$$

is unique and given by the expectation of $X$ given $Y$,

$$f^\star(Y) = \mathbf{E}\{X \mid Y\}.$$

Furthermore, $f^\star(Y)$ and $X - f^\star(Y)$ are uncorrelated.

Proof: Consult ([?], p. 237). ■

Proposition 4: Consider the stochastic variables $X$ and $Y$,
and let the estimation error of $X$ based on $Y$ be given by

$$\tilde{X} = X - \mathbf{E}\{X \mid Y\}.$$

Then,

$$\frac{1}{2}\log_2\det\left(2\pi e\,\mathbf{E}\{\tilde{X}\tilde{X}^T\}\right) \ge h(\tilde{X} \mid Y) = h(X \mid Y) \tag{2}$$

with equality if and only if $X$ and $Y$ are jointly Gaussian.
Proof: Consult [?], p. 21. ■
Fig. 6. A simple model of an estimation problem of the state of the dynamical
system $H$ over a Gaussian communications channel with Gaussian noise
$n \sim \mathcal{N}(0, N)$, Gaussian noise $n_f \sim \mathcal{N}(0, N_f)$ for the feedback channel. Here,
we have feedback from the receiver side to the transmitter side in terms of the
receiver measurement $y(t)$.

Fig. 7. A simple model of an estimation problem of the state of the dynamical
system $H$ over a Gaussian communications channel with Gaussian noise
$n \sim \mathcal{N}(0, N)$, Gaussian noise $n_f \sim \mathcal{N}(0, N_f)$ for the feedback channel with
feedback information given by the receiver's state estimates $\hat{x}(t)$.


III. Problem Formulation 

We will consider the problem for the case $S = I$, as depicted
in Figure 5.

Let $H$ be a first order linear time invariant dynamical system
with state-space realization

$$x(t+1) = a x(t) + b w(t), \quad x(0) = x_0, \quad 0 \le t \le T-1, \tag{3}$$

where $a, b \in \mathbb{R}$, $\mathbf{E}x_0^2 = V_{xx}(0)$, and $w$ is assumed to be white
Gaussian noise with $w(t) \sim \mathcal{N}(0, 1)$ for all $0 \le t \le T-1$.

The measurements at the decoder are given by $y(0) := 0$
and

$$y(t) = z(t) + n(t), \quad \text{for } t \ge 1,$$

where $z$ is the transmitter signal and $n$ is a white Gaussian
noise process with $n(t) \sim \mathcal{N}(0, N)$. The decoder is a map
given by $F : y^{t-1} \mapsto \hat{x}(t)$. Without loss of generality, we will
assume throughout that $b = 1$, as the approach to the general
case $b \ne 1$ is similar.

The transmitter receives the noisy feedback measurements

$$y_f(t) = \phi(t) + n_f(t), \quad \text{for } t \ge 1,$$

where $n_f$ is a white Gaussian noise process with $n_f(t) \sim \mathcal{N}(0, N_f)$.
The encoder is a map given by $G : (x^t, y_f^{t-1}) \mapsto z(t)$.
We also have a per symbol power constraint on the
transmitted signal $z(t)$ given by $\mathbf{E}|z(t)|^2 \le P$.

The objective is to design causal precoder and decoder maps
$G : (x^t, y_f^{t-1}) \mapsto z(t)$ and $F : y^{t-1} \mapsto \hat{x}(t)$, respectively, such
that the average of the mean squared error

$$\frac{1}{T}\sum_{t=1}^{T}\mathbf{E}|x(t) - \hat{x}(t)|^2$$

is minimized. The precoder and decoder maps can be equivalently
written as a causal dynamical system according to

$$\begin{aligned} z(t) &= g_t(x^t, z^{t-1}, y_f^{t-1}) \\ y(t) &= z(t) + n(t) \\ y_f(t) &= \phi(t) + n_f(t) \\ \hat{x}(t) &= f_t(y^{t-1}) \end{aligned}$$

where $g_t$ is the precoder and $f_t$ is the decoder.

Problem 1: Consider the linear system

$$x(t+1) = a(t)x(t) + b(t)w(t),$$

$x(0) = x_0$, $0 \le t \le T-1$, where $a(t), b(t) \in \mathbb{R}$, $\mathbf{E}x_0^2 = V_{xx}(0)$,
and $w$ is white Gaussian noise with $w(t) \sim \mathcal{N}(0, 1)$,
$0 \le t \le T-1$. Let $n$ and $n_f$ be white Gaussian noise processes
independent of each other and of $w$, with $n(t) \sim \mathcal{N}(0, N)$ and
$n_f(t) \sim \mathcal{N}(0, N_f)$. Find an optimal precoder and decoder pair
such that

$$\frac{1}{T}\sum_{t=1}^{T}\mathbf{E}|x(t) - \hat{x}(t)|^2$$

is minimized, where $y(0) = 0$.

Note that we haven't given the form of the function $\phi$. We
will consider two cases of interest here, the first one being
$\phi(t) = y(t)$, and the second one $\phi(t) = \hat{x}(t)$, as depicted in
Figures 6 and 7, respectively.


IV. Main Results 

A. The Finite-Horizon Filtering problem with Receiver-Output 
Feedback 

The first result of this paper presents the structure of the optimal
precoder and decoder for the case where a noisy version
of the receiver output, $y(t)$, is available at the transmitter.

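Theorem 1: Consider Problem 1 with $a(t) = a$, $b(t) = b$,
and $\phi(t) = y(t)$. The optimal communication scheme is given
by

$$\begin{aligned} \hat{x}(t) &= \mathbf{E}\{x(t) \mid y^{t-1}\} \\ \tilde{x}(t) &= x(t) - \hat{x}(t) \\ \bar{x}(t) &= \mathbf{E}\{\tilde{x}(t) \mid x^t, y_f^{t-1}\} \\ z(t) &= \frac{\sqrt{P}}{\sigma_t}\,\bar{x}(t), \end{aligned} \tag{5}$$

where $\sigma_t^2 = \mathbf{E}|\bar{x}(t)|^2$, for $t = 1, \ldots, T$.

Proof: See the Appendix. ■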

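Theorem 2: Consider Problem 1 with $a(t) = a$, $b(t) = b$,
and $\phi(t) = y(t)$. The state space realization of the optimal
communication scheme is given by

$$\begin{aligned} \hat{x}(t+1) &= a\hat{x}(t) + K(t)\,y(t) \\ s(t+1) &= a s(t) + K(t)\,(z(t) + \hat{n}(t)) \\ \bar{x}(t) &= x(t) - s(t) \\ z(t) &= \frac{\sqrt{P}}{\sigma_t}\,\bar{x}(t), \end{aligned} \tag{6}$$

where $\hat{n}(t) = \mathbf{E}\{n(t) \mid n(t) + n_f(t)\}$, $s(0) = 0$, $V_{ss}(0) = V_{sx}(0) = 0$,

$$K(t) = a\sigma_t\sqrt{P}\,(P+N)^{-1} \tag{7}$$

$$\sigma_t^2 = V_{xx}(t) - 2V_{sx}(t) + V_{ss}(t) \tag{8}$$

$$\begin{bmatrix} V_{ss}(t+1) & V_{sx}(t+1) \\ V_{xs}(t+1) & V_{xx}(t+1) \end{bmatrix} = \begin{bmatrix} K^2(t)\frac{N^2}{N+N_f} & 0 \\ 0 & b^2 \end{bmatrix} + \begin{bmatrix} \frac{aN}{P+N} & \frac{aP}{P+N} \\ 0 & a \end{bmatrix} \begin{bmatrix} V_{ss}(t) & V_{sx}(t) \\ V_{xs}(t) & V_{xx}(t) \end{bmatrix} \begin{bmatrix} \frac{aN}{P+N} & \frac{aP}{P+N} \\ 0 & a \end{bmatrix}^T \tag{9}$$

Proof: See the Appendix. ■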
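
The recursion in Theorem 2 is finite-dimensional, so the gains $K(t)$ and variances $\sigma_t^2$ can be precomputed offline. The Python sketch below iterates the $2 \times 2$ covariance recursion with the entries written out by hand. It assumes the noise term of the recursion is $\mathrm{diag}(K^2(t)N^2/(N+N_f),\ b^2)$, as read from the appendix derivation, and the function name and argument order are illustrative assumptions.

```python
import math


def theorem2_recursion(a, b, P, N, Nf, Vxx0, T):
    """Return (sigma_sq, gains) for t = 0..T-1.

    State covariance entries Vss, Vsx, Vxx start from s(0) = 0, so
    Vss(0) = Vsx(0) = 0 and Vxx(0) = Vxx0. Requires N > 0.
    """
    Vss, Vsx, Vxx = 0.0, 0.0, Vxx0
    a11 = a * N / (P + N)            # first row of the transition matrix,
    a12 = a * P / (P + N)            # the second row is [0, a]
    nhat_var = N * N / (N + Nf)      # variance of E{n | n + n_f}
    sigma_sq, gains = [], []
    for _ in range(T):
        s2 = max(Vxx - 2.0 * Vsx + Vss, 0.0)     # sigma_t^2 = E|x - s|^2
        K = a * math.sqrt(s2 * P) / (P + N)      # K(t) = a*sigma_t*sqrt(P)/(P+N)
        sigma_sq.append(s2)
        gains.append(K)
        # V(t+1) = A V(t) A^T + diag(K^2 * nhat_var, b^2), written entrywise
        new_Vss = (a11 * a11 * Vss + 2.0 * a11 * a12 * Vsx
                   + a12 * a12 * Vxx + K * K * nhat_var)
        new_Vsx = a * (a11 * Vsx + a12 * Vxx)
        new_Vxx = a * a * Vxx + b * b
        Vss, Vsx, Vxx = new_Vss, new_Vsx, new_Vxx
    return sigma_sq, gains
```

With $N_f = 0$ the computed $\sigma_t^2$ should follow the scalar recursion $\sigma_{t+1}^2 = a^2\sigma_t^2 N/(N+P) + b^2$ of the noiseless-feedback case, which gives a quick consistency check on the matrix form.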


B. Time-Varying Systems 

The results considered so far treated the case where the
state stems from a linear time invariant system. It's
straightforward to verify that the results hold when we replace the
static parameters $a, b, P, N, N_f$ with time-varying parameters
$a(t), b(t), P(t), N(t), N_f(t)$.


C. Separation Principle for Optimal Communication 
Consider the linear system

$$\begin{aligned} x(t+1) &= a x(t) + b w(t) \\ \gamma(t) &= c x(t) + d v(t) \end{aligned}$$

for $0 \le t \le T-1$, with $x(0) = x_0$, $\mathbf{E}x_0^2 = V_{xx}(0)$, and
$(w, v)$ is a white Gaussian noise process with a given
covariance. We assume now that the transmitter doesn't have
access to the state $x(t)$ but $\gamma(t)$ instead. We get the following
problem.

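Problem 2: Consider the linear system

$$\begin{aligned} x(t+1) &= a x(t) + b w(t) \\ \gamma(t) &= c x(t) + d v(t) \end{aligned}$$

$0 \le t \le T-1$, where $a, b \in \mathbb{R}$, $x(0) = x_0$, $\mathbf{E}x_0^2 = V_{xx}(0)$,
and

$$\mathbf{E}\left\{\begin{bmatrix} w(t) \\ v(t) \end{bmatrix}\begin{bmatrix} w(t) \\ v(t) \end{bmatrix}^T\right\} = \begin{bmatrix} V_{ww}(t) & V_{wv}(t) \\ V_{vw}(t) & V_{vv}(t) \end{bmatrix}$$

is given for $0 \le t \le T$. Let $n$ and $n_f$ be white Gaussian
noise processes independent of each other and of $w$, with $n(t) \sim \mathcal{N}(0, N)$
and $n_f(t) \sim \mathcal{N}(0, N_f)$. Find an optimal precoder and
decoder pair

$$\begin{aligned} y(t) &= z(t) + n(t) \\ y_f(t) &= \phi(t) + n_f(t) \\ \hat{x}(t) &= f_t(y^{t-1}) \end{aligned}$$

such that

$$\frac{1}{T}\sum_{t=1}^{T}\mathbf{E}|x(t) - \hat{x}(t)|^2$$

is minimized, where $y(0) = 0$.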

The optimal transmission scheme is for the transmitter to
find the best estimate of $x(t)$ based on $\gamma^t$, namely $\check{x}(t) = \mathbf{E}\{x(t) \mid \gamma^t\}$,
and then use this estimate as the state to be
transmitted using the optimal communication scheme for the
case of full state measurement at the transmitter given by (6).
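
The first stage of the separation scheme is an ordinary scalar Kalman filter at the transmitter. The Python sketch below is a standard textbook filter, not a transcription of the paper's equations: it assumes that $w$ and $v$ are uncorrelated with constant variances `Vww` and `Vvv`, that the initial prediction error variance equals $V_{xx}(0)$, and that $c^2 V + d^2 V_{vv} > 0$ at every step. It returns the filter gains and the variance of the innovation term that drives the filtered estimate, which plays the role of the process noise seen by the second stage. All names are illustrative.

```python
def transmitter_filter(a, b, c, d, Vxx0, T, Vww=1.0, Vvv=1.0):
    """Scalar Kalman filter for x(t+1) = a x + b w, gamma(t) = c x + d v.

    Returns (gains, drive_var) where gains[t] is the measurement-update gain
    at time t and drive_var[t] is the variance of the correction term applied
    at time t + 1. Assumes w and v are uncorrelated (an assumption of this sketch).
    """
    V = Vxx0                                   # prediction error variance at t = 0
    gains, drive_var = [], []
    for _ in range(T):
        L = V * c / (c * c * V + d * d * Vvv)  # gain on the innovation
        gains.append(L)
        # prediction error variance one step ahead
        V_next = (a - a * L * c) ** 2 * V + b * b * Vww + (a * L * d) ** 2 * Vvv
        L_next = V_next * c / (c * c * V_next + d * d * Vvv)
        # variance of L(t+1) * (innovation at t+1)
        drive_var.append(L_next ** 2 * (c * c * V_next + d * d * Vvv))
        V = V_next
    return gains, drive_var
```

As a sanity check, a perfect measurement ($c = 1$, $d = 0$) gives unit gain and a driving variance equal to $b^2$, so the second stage then sees the original state equation.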

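Theorem 3: The state space realization of the optimal communication
scheme solution of Problem 2 with $\phi(t) = y(t)$ is
given by

$$\begin{aligned} \hat{x}(t+1) &= a\hat{x}(t) + K(t)\,y(t) \\ s(t+1) &= a s(t) + K(t)\,(z(t) + \hat{n}(t)) \\ \bar{x}(t) &= \check{x}(t) - s(t) \\ z(t) &= \frac{\sqrt{P}}{\sigma_t}\,\bar{x}(t), \end{aligned} \tag{11}$$

where $s(0) = 0$, $V_{\check{x}\check{x}}(0) = V_{xx}(0)$, $V_{\xi\xi}(0) = V_{xx}(0)$,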

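$$L(t) = V_{\xi\xi}(t)\,c\,\left(c^2 V_{\xi\xi}(t) + d^2 V_{vv}(t)\right)^{-1}$$

$$V_{\xi\xi}(t+1) = (a - aL(t)c)^2\,V_{\xi\xi}(t) + \begin{bmatrix} b & -aL(t)d \end{bmatrix} \begin{bmatrix} V_{ww}(t) & V_{wv}(t) \\ V_{vw}(t) & V_{vv}(t) \end{bmatrix} \begin{bmatrix} b & -aL(t)d \end{bmatrix}^T$$

$$\beta^2(t) = L^2(t+1)\left(c^2 V_{\xi\xi}(t+1) + d^2 V_{vv}(t+1)\right)$$

$$K(t) = a\sigma_t\sqrt{P}\,(P+N)^{-1} \tag{12}$$

$$\sigma_t^2 = V_{\check{x}\check{x}}(t) - 2V_{s\check{x}}(t) + V_{ss}(t) \tag{13}$$

$$\begin{bmatrix} V_{ss}(t+1) & V_{s\check{x}}(t+1) \\ V_{\check{x}s}(t+1) & V_{\check{x}\check{x}}(t+1) \end{bmatrix} = \begin{bmatrix} K^2(t)\frac{N^2}{N+N_f} & 0 \\ 0 & \beta^2(t) \end{bmatrix} + \begin{bmatrix} \frac{aN}{P+N} & \frac{aP}{P+N} \\ 0 & a \end{bmatrix} \begin{bmatrix} V_{ss}(t) & V_{s\check{x}}(t) \\ V_{\check{x}s}(t) & V_{\check{x}\check{x}}(t) \end{bmatrix} \begin{bmatrix} \frac{aN}{P+N} & \frac{aP}{P+N} \\ 0 & a \end{bmatrix}^T \tag{14}$$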


Proof: The proof is deferred to the appendix. ■ 


D. No Feedback 

A special case is when no feedback is available from
the receiver to the transmitter. This is equivalent to letting
$N_f \to \infty$, or setting $\phi = 0$, as depicted in Figure 2. This will
simply imply that $\hat{n} = 0$ and $\tilde{n} = n$, and thus, we obtain the
optimal communication scheme that was previously obtained
in [?]. The case of no feedback is very delicate, since it does
not possess the property of communicating information that is
orthogonal to the information available at the receiver.

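Corollary 1: The state space realization of the optimal
communication scheme solution of Problem 1 with $\phi = 0$
is given by

$$\begin{aligned} \hat{x}(t+1) &= a\hat{x}(t) + K(t)\,y(t) \\ s(t+1) &= a s(t) + K(t)\,z(t) \\ \bar{x}(t) &= x(t) - s(t) \\ z(t) &= \frac{\sqrt{P}}{\sigma_t}\,\bar{x}(t), \end{aligned} \tag{15}$$

where $s(0) = 0$, $V_{ss}(0) = V_{sx}(0) = 0$,

$$K(t) = a\sigma_t\sqrt{P}\,(P+N)^{-1} \tag{16}$$

$$\sigma_t^2 = V_{xx}(t) - 2V_{sx}(t) + V_{ss}(t) \tag{17}$$

$$\begin{bmatrix} V_{ss}(t+1) & V_{sx}(t+1) \\ V_{xs}(t+1) & V_{xx}(t+1) \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & b^2 \end{bmatrix} + \begin{bmatrix} \frac{aN}{P+N} & \frac{aP}{P+N} \\ 0 & a \end{bmatrix} \begin{bmatrix} V_{ss}(t) & V_{sx}(t) \\ V_{xs}(t) & V_{xx}(t) \end{bmatrix} \begin{bmatrix} \frac{aN}{P+N} & \frac{aP}{P+N} \\ 0 & a \end{bmatrix}^T \tag{18}$$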


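E. Noiseless Feedback

Another interesting special case, which has been solved in
[?], is when we have perfect feedback from the receiver to the
transmitter, as depicted in Figure 3. We will reproduce this
result using our approach, and furthermore, give necessary and
sufficient conditions for the estimation error to be bounded for
the case $|a| \ge 1$.

Let

$$\begin{aligned} \hat{x}(t|t) :&= \mathbf{E}\{x(t) \mid y^t\} \\ &= \mathbf{E}\{\hat{x}(t) + \tilde{x}(t) \mid y^{t-1}, z(t) + n(t)\} \\ &= \hat{x}(t) + \mathbf{E}\left\{\tilde{x}(t) \,\Big|\, \frac{\sqrt{P}}{\sigma_t}\tilde{x}(t) + n(t)\right\} \\ &= \hat{x}(t) + \frac{P}{P+N}\,\tilde{x}(t) + \frac{\sigma_t\sqrt{P}}{P+N}\,n(t), \end{aligned} \tag{19}$$

and

$$\begin{aligned} \tilde{x}(t|t) :&= x(t) - \hat{x}(t|t) \\ &= \hat{x}(t) + \tilde{x}(t) - \hat{x}(t|t) \\ &= \frac{N}{P+N}\,\tilde{x}(t) - \frac{\sigma_t\sqrt{P}}{P+N}\,n(t). \end{aligned} \tag{20}$$

Also, (19)-(20) give

$$\begin{aligned} \hat{x}(t+1) &= \mathbf{E}\{x(t+1) \mid y^t\} \\ &= \mathbf{E}\{a x(t) + b w(t) \mid y^t\} \\ &= a\hat{x}(t|t), \end{aligned} \tag{21}$$

and

$$\begin{aligned} \tilde{x}(t+1) &= x(t+1) - \hat{x}(t+1) \\ &= a\tilde{x}(t|t) + b w(t) \\ &= a\frac{N}{P+N}\,\tilde{x}(t) - a\frac{\sigma_t\sqrt{P}}{P+N}\,n(t) + b w(t). \end{aligned} \tag{22}$$

By considering the state estimation error dynamics in (22), the
reader might be tempted to conclude that the decoder will be
able to track the state $x(t)$ if and only if

$$|a|\,\frac{N}{P+N} < 1.$$

However, this conclusion is erroneous since the gain of the
noise $n(t)$ depends on $\sigma_t = \sqrt{\mathbf{E}\{\tilde{x}^2(t)\}}$. What we need to
consider is the dynamics of the variance of the estimation
error $\tilde{x}(t)$ as follows.

$$\begin{aligned} \mathbf{E}\{\tilde{x}^2(t|t)\} &= \mathbf{E}\left\{\left(\frac{N}{P+N}\,\tilde{x}(t) - \frac{\sigma_t\sqrt{P}}{P+N}\,n(t)\right)^2\right\} \\ &= \frac{N^2}{(P+N)^2}\,\mathbf{E}\{\tilde{x}^2(t)\} + \frac{\sigma_t^2 P}{(P+N)^2}\,\mathbf{E}\{n^2(t)\} \\ &= \frac{N^2}{(P+N)^2}\,\sigma_t^2 + \frac{PN}{(P+N)^2}\,\sigma_t^2 \\ &= \frac{N}{P+N}\,\mathbf{E}\{\tilde{x}^2(t)\}. \end{aligned} \tag{23}$$

Equations (22) and (23) give

$$\begin{aligned} \mathbf{E}\{\tilde{x}^2(t+1)\} &= a^2\,\mathbf{E}\{\tilde{x}^2(t|t)\} + b^2\,\mathbf{E}\{w^2(t)\} \\ &= a^2\frac{N}{N+P}\,\mathbf{E}\{\tilde{x}^2(t)\} + b^2. \end{aligned} \tag{24}$$

The recurrence equation (24) implies that a stationary
solution to Problem 1 for the case $n_f = 0$ exists if and only if

$$1 > a^2\,\frac{N}{P+N},$$

which is equivalent to

$$\log_2(|a|) < \frac{1}{2}\log_2\left(1 + \frac{P}{N}\right).$$

Note that the capacity $C$ of the Gaussian channel is given by

$$C = \frac{1}{2}\log_2\left(1 + \frac{P}{N}\right),$$

so a necessary and sufficient condition for the mean squared
estimation error to be finite is

$$\log_2(|a|) < C.$$

A similar result for stabilization of a control system over a
discrete memoryless channel has been obtained in [?].

F. Stationarity

In this section, we will present conditions under which a
stationary solution exists to Problem 1 for the case $\phi(t) = y(t)$
(that is, a solution as $T \to \infty$). Let $\tilde{x}(t) = x(t) - \hat{x}(t)$ be the
estimation error of $x(t)$ and consider the state space equations
of the optimal estimate. After some algebra, we get the
state space equations for the estimation error (see (49) in the
proof of Theorem 2 in the Appendix):

$$\begin{aligned} \tilde{x}(t+1) &= a\tilde{x}(t) - a\kappa(t)\,y(t) + b w(t) \\ &= a\tilde{x}(t) - a\kappa(t)\frac{\sqrt{P}}{\sigma_t}\,\bar{x}(t) - a\kappa(t)\,n(t) + b w(t) \\ &= a\tilde{x}(t) - \frac{aP}{P+N}\,\bar{x}(t) - K(t)\,n(t) + b w(t) \\ &= a\tilde{x}(t) - \frac{aP}{P+N}\,(\tilde{x}(t) - \breve{x}(t)) - K(t)\,n(t) + b w(t) \\ &= \frac{aN}{P+N}\,\tilde{x}(t) - K(t)\,n(t) + b w(t) + \frac{aP}{P+N}\,\breve{x}(t) \end{aligned}$$

with

$$\breve{x}(t+1) = a\breve{x}(t) - K(t)\,\tilde{n}(t), \tag{28}$$

where $\kappa(t) = \sigma_t\sqrt{P}(P+N)^{-1}$, $K(t) = a\kappa(t)$, $\breve{x}(t) = \tilde{x}(t) - \bar{x}(t)$, and
$\tilde{n}(t) = n(t) - \hat{n}(t)$.

Now suppose that there is a stationary solution to (28).
Obviously, for $\tilde{n}(t) \ne 0$ (that is, $N_f > 0$), the state $\breve{x}(t)$ can be
stationary if and only if $|a| < 1$. In addition, in order for $\tilde{x}(t)$
to be stationary, we must have

$$1 > |a|\,\frac{N}{P+N}.$$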


Clearly, the inequality above is always fulfilled for $|a| < 1$.
We conclude the result above:

Theorem 4: Problem 1 with $\phi(t) = y(t)$ has a stationary
solution for $N_f > 0$ as $T \to \infty$ if and only if $|a| < 1$, and
there are no filters $F$ and $G$ that achieve a finite mean square
error for $|a| \ge 1$.

It's interesting to see the difference between the noiseless
feedback case and the noisy feedback one. This raises the
question of whether the feedback function $\phi$ could be chosen
differently in order to get filters that can track a state as the
time horizon goes to infinity. Indeed, this turns out to be the
case as will be shown in the sequel. Noiseless feedback of
the output, $\phi(t) = y(t)$, makes the state estimates at the
receiver available to the transmitter. This would equivalently
correspond to the case of noiseless feedback of the state
estimates, that is for $n_f = 0$ and $\phi(t) = \hat{x}(t)$ as shown in
Figure 7.

G. Noisy Feedback of the State Estimates

Suppose that the receiver transmits its state estimates $\hat{x}(t)$
back to the transmitter over a noisy channel. Being inspired by ...

$$\tilde{\sigma}^2 = \frac{a^2N^2}{(P+N)^2}\,(\cdots) + \cdots \tag{30}$$

$$\bar{\sigma}^2 = \frac{a^2PN}{(P+N)^2}\,(\cdots) + \cdots \tag{31}$$

The pair of equations is equivalent to a couple of fourth
order polynomial equations in the two variables $(\tilde{\sigma}^2, \bar{\sigma}^2)$, and
solutions to these equations can be found easily using standard
numerical tools.

V. Conclusions

We considered the problem of optimal encoder/decoder filter
design over a Shannon Gaussian channel with noisy feedback
to estimate the state of a scalar linear dynamical system. We
showed that optimal encoders and decoders are linear filters
with a finite memory and we gave explicitly the state space
realization of the optimal filters. We also presented the solution
of the case where the transmitter has access to noisy measurements
of the state. We derived a separation principle for this
communication scheme. Necessary and sufficient conditions
for the existence of a stationary solution were also given.

Future work will consider the case where the noise process
$n$ is colored for some linear filter $S \ne I$. Also, the non-scalar
case is challenging as we can't rely on the information theoretic
inequalities used in this paper for the higher dimensional
case.
Appendix

Proof of Theorem 1

Suppose that $\mathbf{E}\{g_t(x^t)\} = \alpha_t$ where $\alpha_t$ are deterministic
real numbers independent of $x^t$ and are known at the
encoder $g_t$ and decoder $f_t$. Note that $y(t) = g_t(x^t) + n(t)$. The
estimate of $x(t+1)$ based on $y(k)$, $k = 0, \ldots, t$, is the same as
the estimate of $x(t+1)$ based on $y(k) - \alpha_k$ for $k = 0, \ldots, t$ since
$\alpha_k$ is deterministic and known at the decoder. But it means
that we can replace $g_t(x^t)$ with $g'_t(x^t) = g_t(x^t) - \alpha_t$, and
$g'_t(x^t)$ satisfies both $\mathbf{E}\{g'_t(x^t)\} = 0$ and the power constraint
$\mathbf{E}|g'_t(x^t)|^2 \le P$ since

$$\begin{aligned} \mathbf{E}|g'_t(x^t)|^2 &= \mathbf{E}|g_t(x^t) - \alpha_t|^2 \\ &= \mathbf{E}|g_t(x^t)|^2 - \alpha_t^2 \\ &\le P - \alpha_t^2 \le P. \end{aligned}$$

Thus, without loss of generality, we may restrict the encoders
$g$ to the set

$$\{g \mid \mathbf{E}\{g(x^t)\} = 0\}.$$

We will now prove that the optimal filters are linear by
induction. Suppose that $g_{k-1}$ and $f_{k-1}$ are linear for $k = 1, \ldots, t$.
Then, $\tilde{x}(t)$, $x^t$, and $y_f^{t-1}$ are jointly Gaussian.

Let $\hat{x}(t|t) = f^\star_t(y^t)$ be the optimal estimate of $x(t)$ based
on $y^t$ and let $\tilde{x}(t|t) = x(t) - \hat{x}(t|t)$, for $t = 0, \ldots, T$. Then,
$f^\star_t(y^t) = \mathbf{E}\{x(t) \mid y^t\}$ according to Proposition 3. Now we have
that

$$\begin{aligned} \hat{x}(t|t) &= \mathbf{E}\{x(t) \mid y^t\} \\ &= \mathbf{E}\{\hat{x}(t) + \tilde{x}(t) \mid y^t\} \\ &= \hat{x}(t) + \mathbf{E}\{\tilde{x}(t) \mid y^t\}, \end{aligned} \tag{32}$$

$$\begin{aligned} \tilde{x}(t+1) &= x(t+1) - \hat{x}(t+1) \\ &= a x(t) + b w(t) - a\hat{x}(t|t) \\ &= a\tilde{x}(t|t) + b w(t). \end{aligned} \tag{33}$$

We see that minimizing $\mathbf{E}|\tilde{x}(t+1)|^2$ is equivalent to minimizing
the mean square error of

$$\tilde{x}(t|t) = x(t) - \mathbf{E}\{x(t) \mid y^t\}$$

at the decoder. Now introduce

$$\bar{x}(t) := \mathbf{E}\{\tilde{x}(t) \mid x^t, y_f^{t-1}\}$$

and

$$\breve{x}(t) := \tilde{x}(t) - \bar{x}(t).$$

Then, $\bar{x}(t)$ is a linear function of $x^t$ and $y_f^{t-1}$, since $\tilde{x}(t)$,
$x^t$, and $y_f^{t-1}$ are jointly Gaussian by the induction hypothesis.
Thus, $\breve{x}(t)$ is independent of $\bar{x}(t)$, $x^t$, and $y_f^{t-1}$. This implies
that $\breve{x}(t)$ is independent of $g_t(x^t, y_f^{t-1})$ and $g_t(x^t, y_f^{t-1}) + n(t) = y(t)$.

The Markov chain

$$\bar{x}(t) \to g_t(x^t, y_f^{t-1}) \to y(t) = g_t(x^t, y_f^{t-1}) + n(t),$$

together with Proposition 2 gives

$$I(\bar{x}(t); y(t)) \le I(g_t(x^t, y_f^{t-1}); y(t)). \tag{34}$$

The Shannon capacity of a Gaussian channel gives an upper
bound for the mutual information between the transmitted
message $z(t) = g_t(x^t, y_f^{t-1})$ and received message $y(t)$ (see [?]):

$$I(g_t(x^t, y_f^{t-1}); y(t)) \le \frac{1}{2}\log_2\left(1 + \frac{P}{N}\right). \tag{35}$$

Combining (34)-(35), we get

$$I(\bar{x}(t); y(t)) \le \frac{1}{2}\log_2\left(1 + \frac{P}{N}\right) \tag{36}$$

with equality if $\bar{x}(t)$ and $y(t)$ are mutually Gaussian and
$g_t(x^t, y_f^{t-1}) = \frac{\sqrt{P}}{\sigma_t}\bar{x}(t)$ with $\sigma_t^2 = \mathbf{E}|\bar{x}(t)|^2$. From the
definition of mutual information, we have that

$$h(\bar{x}(t) \mid y(t)) = h(\bar{x}(t)) - I(\bar{x}(t); y(t)). \tag{37}$$

Now we get

$$\begin{aligned} 2\pi e\,\mathbf{E}\{|\tilde{x}(t|t)|^2\} &\ge 2^{2h(x(t) \mid y^t)} && (38) \\ &= 2^{2h(\tilde{x}(t) \mid y(t))} && (39) \\ &= 2^{2h(\bar{x}(t) + \breve{x}(t) \mid y(t))} && (40) \\ &\ge 2^{2h(\bar{x}(t) \mid y(t))} + 2^{2h(\breve{x}(t))} && (41) \\ &= 2^{2h(\bar{x}(t)) - 2I(\bar{x}(t); y(t))} + 2^{2h(\breve{x}(t))} && (42) \\ &\ge \frac{N}{P+N}\,2^{2h(\bar{x}(t))} + 2^{2h(\breve{x}(t))}, && (43) \end{aligned}$$

where (38) follows from Proposition 4 (with equality if $x(t)$
and $y^t$ are jointly Gaussian), (39) follows from the fact that
$\tilde{x}(t)$ is independent of $y^{t-1}$, (40) follows from the fact that
$\breve{x}(t)$ is independent of $\bar{x}(t)$ and $y(t)$, (41) follows from the
entropy power inequality (Proposition 1), (42) follows from
equation (37), and (43) follows from inequality (36). Furthermore,
equality holds in (38)-(43) if

$$z(t) = g_t(x^t, y_f^{t-1}) = \frac{\sqrt{P}}{\sigma_t}\,\bar{x}(t)$$

with $\sigma_t^2 = \mathbf{E}|\bar{x}(t)|^2$. This completes the proof.
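Proof of Theorem

Let $\hat{x}(t) = \mathbf{E}\{x(t) \mid y^{t-1}\}$, $\tilde{x}(t) = x(t) - \hat{x}(t)$, $\hat{x}(t|t) = \mathbf{E}\{x(t) \mid y^{t}\}$, and $\tilde{x}(t|t) = x(t) - \hat{x}(t|t)$.

Then,

$$
\begin{aligned}
\hat{x}(t+1) &= a\hat{x}(t|t) \\
&= a\mathbf{E}\{\hat{x}(t) + \tilde{x}(t) \mid y^{t}\} \\
&= a\hat{x}(t) + a\mathbf{E}\{\tilde{x}(t) \mid y(t)\}
\end{aligned}
\tag{44}
$$

and

$$
\tilde{x}(t+1) = a\tilde{x}(t) - a\mathbf{E}\{\tilde{x}(t) \mid y(t)\} + bw(t). \tag{45}
$$

According to Theorem, the optimal signal $z$ is given by

$$
\bar{x}(t) = \mathbf{E}\{\tilde{x}(t) \mid x^{t}, y_f^{t-1}\}, \qquad
z(t) = \frac{\sqrt{P}}{\sigma_t}\,\bar{x}(t),
$$

with $\sigma_t^2 = \mathbf{E}|\bar{x}(t)|^2$. Now recall that $y(t) = z(t) + n(t)$, $\bar{x}(t) = \mathbf{E}\{\tilde{x}(t) \mid x^{t}, y_f^{t-1}\}$, $\breve{x}(t) = \tilde{x}(t) - \bar{x}(t)$, and $\breve{x}(t)$ is orthogonal to $x^{t}$ and hence to $y(t)$. Since $\tilde{x}(t)$ and $y(t)$ are jointly Gaussian, $\mathbf{E}\{\tilde{x}(t) \mid y(t)\}$ is a linear function of $y(t)$ given by

$$
\begin{aligned}
\mathbf{E}\{\tilde{x}(t) \mid y(t)\} &= \mathbf{E}\{\bar{x}(t) + \breve{x}(t) \mid y(t)\} \\
&= \mathbf{E}\{\bar{x}(t) \mid y(t)\} + \mathbf{E}\{\breve{x}(t) \mid y(t)\} \\
&= \mathbf{E}\{\bar{x}(t) \mid y(t)\} \\
&= \mathrm{cov}\{\bar{x}(t), y(t)\}\,(\mathrm{cov}\{y(t), y(t)\})^{-1}\,y(t) \\
&= \kappa(t)\,y(t)
\end{aligned}
\tag{46}
$$

with

$$
\kappa(t) = \sigma_t\sqrt{P}\,(P+N)^{-1}.
$$

Then, (44)-(46) imply

$$
\hat{x}(t+1) = a\hat{x}(t) + a\kappa(t)\,y(t) \tag{47}
$$

$$
\phantom{\hat{x}(t+1)} = a\hat{x}(t) + a\kappa(t)\frac{\sqrt{P}}{\sigma_t}\,\bar{x}(t) + a\kappa(t)\,n(t), \tag{48}
$$

and

$$
\begin{aligned}
\tilde{x}(t+1) &= a\tilde{x}(t) - a\kappa(t)\,y(t) + bw(t) \\
&= a\tilde{x}(t) - a\kappa(t)\frac{\sqrt{P}}{\sigma_t}\,\bar{x}(t) - a\kappa(t)\,n(t) + bw(t).
\end{aligned}
\tag{49}
$$

The encoder has access to $x^{t+1}$ at time $t+1$. It also has access to $z^{t}$ and $y_f^{t}$, which implies that it has access to

$$
y_f(k) - z(k) = n(k) + n_f(k)
$$

for $k = 0, \ldots, t$. Now we have that

$$
\hat{n}(t) = \mathbf{E}\{n(t) \mid n(t) + n_f(t)\} = \frac{N}{N+N_f}\,(n(t) + n_f(t)) \tag{50}
$$

and

$$
\tilde{n}(t) = n(t) - \hat{n}(t) = \frac{N_f}{N+N_f}\,n(t) - \frac{N}{N+N_f}\,n_f(t). \tag{51}
$$

We will show that

$$
\bar{x}(t+1) = a\bar{x}(t) - a\kappa(t)\frac{\sqrt{P}}{\sigma_t}\,\bar{x}(t) - a\kappa(t)\,\hat{n}(t) + bw(t) \tag{52}
$$

and

$$
\breve{x}(t+1) = a\breve{x}(t) - a\kappa(t)\,\tilde{n}(t). \tag{53}
$$

First we note that $\breve{x}$ as defined in (53) depends only on the channel noise estimation error $\tilde{n}$ and is therefore independent of $x$, $y_f$, and $\bar{x}$. Now (52)-(53) give

$$
\begin{aligned}
\bar{x}(t+1) + \breve{x}(t+1)
&= a\bar{x}(t) - a\kappa(t)\frac{\sqrt{P}}{\sigma_t}\,\bar{x}(t) - a\kappa(t)\,\hat{n}(t) + bw(t) + a\breve{x}(t) - a\kappa(t)\,\tilde{n}(t) \\
&= a(\bar{x}(t) + \breve{x}(t)) - a\kappa(t)\frac{\sqrt{P}}{\sigma_t}\,\bar{x}(t) - a\kappa(t)(\hat{n}(t) + \tilde{n}(t)) + bw(t) \\
&= a\tilde{x}(t) - a\kappa(t)\frac{\sqrt{P}}{\sigma_t}\,\bar{x}(t) - a\kappa(t)\,n(t) + bw(t),
\end{aligned}
\tag{54}
$$

which is exactly the expression for the dynamics of $\tilde{x}(t+1)$ given by (49). This establishes (52)-(53). Now we have

$$
\begin{aligned}
\bar{x}(t) &= \mathbf{E}\{\tilde{x}(t) \mid x^{t}, y_f^{t-1}\} \\
&= \mathbf{E}\{-\hat{x}(t) + x(t) \mid x^{t}, y_f^{t-1}\} \\
&= -\mathbf{E}\{\hat{x}(t) \mid x^{t}, y_f^{t-1}\} + \mathbf{E}\{x(t) \mid x^{t}, y_f^{t-1}\} \\
&= -\mathbf{E}\{\hat{x}(t) \mid x^{t}, y_f^{t-1}\} + x(t).
\end{aligned}
\tag{55}
$$

From equation (48), we see that

$$
\mathbf{E}\{\hat{x}(t) \mid x^{t}, y_f^{t-1}\} = s(t) \tag{56}
$$

where

$$
\begin{aligned}
s(t+1) &= as(t) + a\kappa(t)\frac{\sqrt{P}}{\sigma_t}\,\bar{x}(t) + a\kappa(t)\,\hat{n}(t) \\
&= as(t) + a\kappa(t)\,(z(t) + \hat{n}(t)),
\end{aligned}
\tag{57}
$$

since the noise signal $\tilde{n}$ is independent of $x$ and $y_f$. Finally, combining (55)-(57) gives

$$
\bar{x}(t) = x(t) - s(t). \tag{58}
$$

Now set

$$
K(t) = a\kappa(t). \tag{59}
$$

Then,

$$
\begin{aligned}
s(t+1) &= as(t) + a\kappa(t)\frac{\sqrt{P}}{\sigma_t}\,\bar{x}(t) + a\kappa(t)\,\hat{n}(t) \\
&= as(t) + \frac{aP}{P+N}\,\bar{x}(t) + K(t)\,\hat{n}(t) \\
&= as(t) - \frac{aP}{P+N}\,s(t) + \frac{aP}{P+N}\,x(t) + K(t)\,\hat{n}(t) \\
&= \frac{aN}{P+N}\,s(t) + \frac{aP}{P+N}\,x(t) + K(t)\,\hat{n}(t)
\end{aligned}
\tag{60}
$$

and

$$
\begin{bmatrix} s(t+1) \\ x(t+1) \end{bmatrix}
=
\begin{bmatrix} \frac{aN}{P+N} & \frac{aP}{P+N} \\ 0 & a \end{bmatrix}
\begin{bmatrix} s(t) \\ x(t) \end{bmatrix}
+
\begin{bmatrix} K(t) & 0 \\ 0 & b \end{bmatrix}
\begin{bmatrix} \hat{n}(t) \\ w(t) \end{bmatrix}.
$$

Introduce the covariance matrix

$$
\begin{bmatrix} V_{ss}(t) & V_{sx}(t) \\ V_{xs}(t) & V_{xx}(t) \end{bmatrix}
= \mathbf{E}\left\{
\begin{bmatrix} s(t) \\ x(t) \end{bmatrix}
\begin{bmatrix} s(t) \\ x(t) \end{bmatrix}^{T}
\right\}.
$$

Since $\hat{n}(t)$ and $w(t)$ are uncorrelated with $x(t)$ and $s(t)$, we get

$$
\begin{aligned}
\begin{bmatrix} V_{ss}(t+1) & V_{sx}(t+1) \\ V_{xs}(t+1) & V_{xx}(t+1) \end{bmatrix}
&= \mathbf{E}\left\{
\begin{bmatrix} s(t+1) \\ x(t+1) \end{bmatrix}
\begin{bmatrix} s(t+1) \\ x(t+1) \end{bmatrix}^{T}
\right\} \\
&=
\begin{bmatrix} \frac{aN}{P+N} & \frac{aP}{P+N} \\ 0 & a \end{bmatrix}
\begin{bmatrix} V_{ss}(t) & V_{sx}(t) \\ V_{xs}(t) & V_{xx}(t) \end{bmatrix}
\begin{bmatrix} \frac{aN}{P+N} & \frac{aP}{P+N} \\ 0 & a \end{bmatrix}^{T} \\
&\quad +
\begin{bmatrix} K(t) & 0 \\ 0 & b \end{bmatrix}
\mathbf{E}\left\{
\begin{bmatrix} \hat{n}(t) \\ w(t) \end{bmatrix}
\begin{bmatrix} \hat{n}(t) \\ w(t) \end{bmatrix}^{T}
\right\}
\begin{bmatrix} K(t) & 0 \\ 0 & b \end{bmatrix}^{T},
\end{aligned}
\tag{61}
$$

where, by (50), $\mathbf{E}|\hat{n}(t)|^2 = N^2/(N+N_f)$. Thus,

$$
\begin{aligned}
\sigma_t^2 &= \mathbf{E}|\bar{x}(t)|^2 \\
&= \mathbf{E}|x(t) - s(t)|^2 \\
&= V_{xx}(t) - 2V_{sx}(t) + V_{ss}(t).
\end{aligned}
\tag{62}
$$

Putting together the equations above, up to (62), gives the desired result.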
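As a numerical illustration of the recursion (60)-(62), the following Python sketch propagates the covariance entries of $(s(t), x(t))$ and evaluates $\sigma_t^2$ at each step. It is only a sketch under assumptions that the proof does not state: the process noise $w(t)$ is taken to have unit variance and to be uncorrelated with $\hat{n}(t)$, the initial conditions are $s(0) = 0$ and $\mathbf{E}|x(0)|^2 =$ `x0_var`, and the function name `sigma2_sequence` is hypothetical.

```python
import math


def sigma2_sequence(a, b, P, N, Nf, x0_var, T):
    """Iterate the covariance recursion (61) and return [sigma_0^2, ..., sigma_T^2].

    Assumptions of this sketch (not from the proof): E|w(t)|^2 = 1,
    s(0) = 0, and E|x(0)|^2 = x0_var.
    """
    # Covariance entries of the stacked vector (s(t), x(t)).
    Vss, Vsx, Vxx = 0.0, 0.0, x0_var
    n_hat_var = N * N / (N + Nf)   # E|n_hat(t)|^2, which follows from (50)
    f = a * N / (P + N)            # coefficient of s(t) in (60)
    g = a * P / (P + N)            # coefficient of x(t) in (60)
    out = []
    for _ in range(T + 1):
        # (62): sigma_t^2 = V_xx - 2 V_sx + V_ss (clipped against round-off).
        sigma2 = max(Vxx - 2.0 * Vsx + Vss, 0.0)
        out.append(sigma2)
        # (59): K(t) = a * kappa(t) = a * sigma_t * sqrt(P) / (P + N).
        K = a * math.sqrt(sigma2 * P) / (P + N)
        # (61) written entrywise, with A = [[f, g], [0, a]] and B = diag(K, b).
        Vss_next = f * f * Vss + 2.0 * f * g * Vsx + g * g * Vxx + K * K * n_hat_var
        Vsx_next = a * (f * Vsx + g * Vxx)
        Vxx_next = a * a * Vxx + b * b
        Vss, Vsx, Vxx = Vss_next, Vsx_next, Vxx_next
    return out
```

For instance, `sigma2_sequence(0.9, 1.0, 1.0, 1.0, 0.5, 1.0, 20)` returns the 21 values $\sigma_0^2, \ldots, \sigma_{20}^2$ for one hypothetical choice of parameters; whether the sequence settles depends on those parameters.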


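Proof of Theorem

Define the estimate $\hat{x}(t|t-1) = \mathbf{E}\{x(t) \mid Y^{t-1}\}$ and let

$$
\xi(t) = x(t) - \hat{x}(t|t-1)
$$

be the estimation error. It is well known that $\hat{x}(t)$ is given by the Kalman filter

$$
\begin{aligned}
\hat{x}(t) &= \hat{x}(t|t-1) + L(t)(c\xi(t) + dv(t)) \\
\hat{x}(t+1|t) &= a\hat{x}(t) \\
&= a\hat{x}(t|t-1) + aL(t)(c\xi(t) + dv(t)) \\
\xi(t+1) &= (a - aL(t)c)\,\xi(t) + bw(t) - aL(t)\,d\,v(t)
\end{aligned}
$$

where $L(t)$ are the optimal Kalman filter gains for $t = 0, \ldots, T$ (see, e.g., [16]):

$$
\begin{aligned}
L(t) &= V_{\xi\xi}(t)\,c\,(c^2 V_{\xi\xi}(t) + d^2 V_{vv}(t))^{-1} \\
V_{\xi\xi}(t+1) &= (a - aL(t)c)^2\,V_{\xi\xi}(t)
+ \begin{bmatrix} b & -aL(t)d \end{bmatrix}
\begin{bmatrix} V_{ww}(t) & V_{wv}(t) \\ V_{vw}(t) & V_{vv}(t) \end{bmatrix}
\begin{bmatrix} b & -aL(t)d \end{bmatrix}^{T}.
\end{aligned}
$$

We also know that $Y^{t-1}$ and $\xi(t)$ are uncorrelated according to Proposition. This implies in turn that $\hat{x}(t)$ and $\xi(t)$ are uncorrelated. Hence, with $\check{x}(t)$ denoting the decoder's estimate, the averaged estimation error of the decoder is equal to

$$
\sum_{t=1}^{T} \mathbf{E}|x(t) - \check{x}(t)|^2 = \sum_{t=1}^{T} \left( \mathbf{E}|\hat{x}(t) - \check{x}(t)|^2 + \mathbf{E}|\xi(t)|^2 \right).
$$

Obviously, the decoder cannot do much about the error covariance $\mathbf{E}|\xi(t)|^2$. The decoder $F$ minimizes the averaged estimation error above if and only if it minimizes the averaged estimation error of $\hat{x}(t)$. Thus, we have transformed the output measurement problem to a state measurement problem at the encoder $G$, where the measured state is the state $\hat{x}(t)$ of the linear time-varying dynamical system given by

$$
\begin{aligned}
\hat{x}(t+1) &= \hat{x}(t+1|t) + L(t+1)(c\xi(t+1) + dv(t+1)) \\
&= a\hat{x}(t) + L(t+1)(c\xi(t+1) + dv(t+1)) \\
&= a\hat{x}(t) + \rho(t)\,\omega(t)
\end{aligned}
$$

with $\omega(t) \sim \mathcal{N}(0, 1)$ and

$$
\begin{aligned}
\rho^2(t) &= L^2(t+1)\,\mathbf{E}\{(c\xi(t+1) + dv(t+1))^2\} \\
&= L^2(t+1)\,(c^2 V_{\xi\xi}(t+1) + d^2 V_{vv}(t+1)).
\end{aligned}
$$

Inserting $b(t) = \rho(t)$ in Problem and using Theorem gives the stated equations. This concludes the proof.

Proof of Theorem

Similar to Theorem, we have that

$$
\hat{x}(t+1) = a\hat{x}(t) + K(t)\frac{\sqrt{P}}{\sigma_t}\,\bar{x}(t) + K(t)\,n(t), \tag{64}
$$

$$
\tilde{x}(t+1) = a\tilde{x}(t) - K(t)\frac{\sqrt{P}}{\sigma_t}\,\bar{x}(t) - K(t)\,n(t) + bw(t) \tag{65}
$$

with, as before,

$$
K(t) = a\sigma_t\sqrt{P}\,(P+N)^{-1}, \qquad \sigma_t^2 = \mathbf{E}\,\bar{x}^2(t). \tag{66}
$$

Clearly,

$$
\begin{aligned}
\bar{x}(t) &= \mathbf{E}\{\tilde{x}(t) \mid x^{t}, y_f^{t-1}\} \\
&= \mathbf{E}\{\tilde{x}(t) \mid x^{t-1}, w(t-1), y_f^{t-1}\} \\
&= \mathbf{E}\{a\tilde{x}(t-1) \mid x^{t-1}, w(t-1), y_f^{t-1}\} - K(t-1)\frac{\sqrt{P}}{\sigma_{t-1}}\,\bar{x}(t-1) + bw(t-1) \\
&= a\bar{x}(t-1) + \mathbf{E}\{a\breve{x}(t-1) \mid x^{t-1}, w(t-1), y_f^{t-1}\} - K(t-1)\frac{\sqrt{P}}{\sigma_{t-1}}\,\bar{x}(t-1) + bw(t-1),
\end{aligned}
\tag{67}
$$

where $\breve{x}(t) = \tilde{x}(t) - \bar{x}(t)$. The transmitter can construct the new measurement

$$
x(t-1) - \bar{x}(t-1) - y_f(t-1) = \breve{x}(t-1) + n_f(t-1),
$$

so

$$
\mathbf{E}\{a\breve{x}(t-1) \mid x^{t-1}, w(t-1), y_f^{t-1}\}
= \mathbf{E}\{a\breve{x}(t-1) \mid x^{t-1}, w(t-1), y_f^{t-2}, \breve{x}(t-1) + n_f(t-1)\}.
$$

Since $\breve{x}(t-1)$ and $n_f(t-1)$ are independent of $x^{t-1}$, $w(t-1)$, and $y_f^{t-2}$, we have that

$$
\mathbf{E}\{a\breve{x}(t-1) \mid x^{t-1}, w(t-1), y_f^{t-1}\}
= \mathbf{E}\{a\breve{x}(t-1) \mid \breve{x}(t-1) + n_f(t-1)\}.
$$

Let $\tilde{\sigma}_t^2 = \mathbf{E}\,\breve{x}^2(t) = \mathbf{E}(\tilde{x}(t) - \bar{x}(t))^2 = \mathbf{E}\,\tilde{x}^2(t) - \mathbf{E}\,\bar{x}^2(t)$. Then,

$$
\mathbf{E}\{a\breve{x}(t) \mid \breve{x}(t) + n_f(t)\}
= a\tilde{\sigma}_t^2\,(\tilde{\sigma}_t^2 + N_f)^{-1}\,(\breve{x}(t) + n_f(t)). \tag{68}
$$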
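The conditional expectation in (68) is the standard linear estimate of a zero-mean Gaussian variable from a noisy observation of it. The Python sketch below compares the closed-form gain $a\tilde{\sigma}_t^2/(\tilde{\sigma}_t^2 + N_f)$ with a sample regression coefficient obtained from simulated data. The function names `lmmse_gain` and `empirical_gain`, the sample size, and the seed are hypothetical choices made for this illustration only.

```python
import random


def lmmse_gain(a, var_x, var_nf):
    """Closed-form gain g such that E{a*X | X + n_f} = g * (X + n_f), as in (68)."""
    return a * var_x / (var_x + var_nf)


def empirical_gain(a, var_x, var_nf, n_samples=200_000, seed=0):
    """Estimate the same gain by regressing a*X on Y = X + n_f over simulated samples.

    X and n_f are drawn as independent zero-mean Gaussians, so no mean
    correction is applied in the regression.
    """
    rng = random.Random(seed)
    sxy = 0.0  # running sum of (a*X) * Y
    syy = 0.0  # running sum of Y * Y
    for _ in range(n_samples):
        x = rng.gauss(0.0, var_x ** 0.5)
        y = x + rng.gauss(0.0, var_nf ** 0.5)
        sxy += a * x * y
        syy += y * y
    return sxy / syy
```

With, say, `a = 0.9`, `var_x = 2.0`, and `var_nf = 0.5`, the closed-form gain is `0.9 * 2.0 / 2.5`, and the simulated coefficient should agree with it up to sampling error.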
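Equation (68) together with (67) gives

$$
\begin{aligned}
\bar{x}(t+1) &= a\bar{x}(t) - K(t)\frac{\sqrt{P}}{\sigma_t}\,\bar{x}(t) + bw(t) + a\tilde{\sigma}_t^2\,(\tilde{\sigma}_t^2 + N_f)^{-1}(\breve{x}(t) + n_f(t)) \\
&= aN(P+N)^{-1}\,\bar{x}(t) + bw(t) + a\tilde{\sigma}_t^2\,(\tilde{\sigma}_t^2 + N_f)^{-1}(\breve{x}(t) + n_f(t))
\end{aligned}
\tag{69}
$$

and we can verify that

$$
\begin{aligned}
\breve{x}(t+1) &= a\breve{x}(t) - K(t)\,n(t) - a\tilde{\sigma}_t^2\,(\tilde{\sigma}_t^2 + N_f)^{-1}(\breve{x}(t) + n_f(t)) \\
&= aN_f\,(\tilde{\sigma}_t^2 + N_f)^{-1}\,\breve{x}(t) - K(t)\,n(t) - a\tilde{\sigma}_t^2\,(\tilde{\sigma}_t^2 + N_f)^{-1}\,n_f(t).
\end{aligned}
\tag{70}
$$

Now by substituting $bw(t) = x(t+1) - ax(t)$ and $\breve{x}(t) + n_f(t) = x(t) - \bar{x}(t) - y_f(t)$ in (69), we get

$$
\bar{x}(t+1) = aN(P+N)^{-1}\,\bar{x}(t) + x(t+1) - ax(t) + a\tilde{\sigma}_t^2\,(\tilde{\sigma}_t^2 + N_f)^{-1}(x(t) - \bar{x}(t) - y_f(t)). \tag{71}
$$

The dynamical system equations given by (69) and (70) give the dynamics of the variance values $\sigma_t^2$ and $\tilde{\sigma}_t^2$:

$$
\sigma_{t+1}^2 = \frac{a^2 N^2}{(P+N)^2}\,\sigma_t^2 + \frac{a^2 \tilde{\sigma}_t^4}{\tilde{\sigma}_t^2 + N_f} + b^2\,\mathbf{E}|w(t)|^2 \tag{72}
$$

and

$$
\tilde{\sigma}_{t+1}^2 = \frac{a^2 N_f}{\tilde{\sigma}_t^2 + N_f}\,\tilde{\sigma}_t^2 + \frac{a^2 P N}{(P+N)^2}\,\sigma_t^2. \tag{73}
$$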
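The variance recursions (72)-(73) can be iterated numerically to see how the pair of variances evolves for given channel parameters. The Python sketch below does this under assumptions that are not taken from the text: the process noise $w(t)$ is assumed to have unit variance, the initial pair of variances is supplied by the caller, and the names `variance_step` and `variance_sequence` are hypothetical.

```python
def variance_step(sigma2, tsigma2, a, b, P, N, Nf):
    """One step of (72)-(73), assuming E|w(t)|^2 = 1 (an assumption of this sketch).

    sigma2  : variance of the encoder-side estimate of the decoder error
    tsigma2 : variance of the remaining residual (the tilde-sigma term)
    Returns the pair of variances at time t + 1.
    """
    # (72): contraction of sigma2, plus the feedback-driven correction and process noise.
    sigma2_next = (a * N / (P + N)) ** 2 * sigma2 \
        + a * a * tsigma2 * tsigma2 / (tsigma2 + Nf) \
        + b * b
    # (73): residual attenuated by the feedback noise ratio, plus channel noise injection.
    tsigma2_next = a * a * Nf * tsigma2 / (tsigma2 + Nf) \
        + a * a * P * N * sigma2 / (P + N) ** 2
    return sigma2_next, tsigma2_next


def variance_sequence(sigma2_0, tsigma2_0, a, b, P, N, Nf, T):
    """Iterate variance_step T times and return the list of (sigma2, tsigma2) pairs."""
    pairs = [(sigma2_0, tsigma2_0)]
    for _ in range(T):
        pairs.append(variance_step(*pairs[-1], a, b, P, N, Nf))
    return pairs
```

Calling `variance_sequence(1.0, 0.5, 0.9, 1.0, 1.0, 1.0, 0.5, 50)` and inspecting the last few pairs gives a quick, informal indication of whether the recursion appears to settle for that particular parameter choice; it is not a substitute for the stationarity conditions derived in the paper.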







